    Learning the Structural Vocabulary of a Network

    Networks have become instrumental in deciphering how information is processed and transferred within systems in almost every scientific field today. Nearly all network analyses, however, have relied on humans to devise structural features of networks believed to be most discriminative for an application. We present a framework for comparing and classifying networks without human-crafted features using deep learning. After training, autoencoders contain hidden units that encode a robust structural vocabulary for succinctly describing graphs. We use this feature vocabulary to tackle several network mining problems and find improved predictive performance versus many popular features used today. These problems include uncovering growth mechanisms driving the evolution of networks, predicting protein network fragility, and identifying environmental niches for metabolic networks. Deep learning offers a principled approach for mining complex networks and tackling graph-theoretic problems.

    Evidence of Rentian Scaling of Functional Modules in Diverse Biological Networks

    Biological networks have long been known to be modular, containing sets of nodes that are highly connected internally. Less emphasis, however, has been placed on understanding how intermodule connections are distributed within a network. Here, we borrow ideas from engineered circuit design and study Rentian scaling, which states that the number of external connections between nodes in different modules is related to the number of nodes inside the modules by a power-law relationship. We tested this property in a broad class of molecular networks, including protein interaction networks for six species and gene regulatory networks for 41 human and 25 mouse cell types. Using evolutionarily defined modules corresponding to known biological processes in the cell, we found that all networks displayed Rentian scaling with a broad range of exponents. We also found evidence for Rentian scaling in functional modules in the Caenorhabditis elegans neural network, but, interestingly, not in three different social networks, suggesting that this property does not inevitably emerge. To understand how such scaling may have arisen evolutionarily, we derived a new graph model that can generate Rentian networks given a target Rent exponent and a module decomposition as inputs. Overall, our work uncovers a new principle shared by engineered circuits and biological networks.
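The power-law relationship described in this abstract can be written as E = k * N^p, where N is the number of nodes in a module, E is its number of external connections, and p is the Rent exponent. The sketch below is only an illustration of how such an exponent might be estimated, by ordinary least squares on log-transformed values. The function name `fit_rent_exponent` and the synthetic module data are assumptions made for this example and are not taken from the paper, whose fitting procedure may differ.

```python
import math

def fit_rent_exponent(module_sizes, external_edges):
    """Fit log E = log k + p * log N by least squares; return (p, k).

    Hypothetical helper for illustration only.
    """
    xs = [math.log(n) for n in module_sizes]
    ys = [math.log(e) for e in external_edges]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope of the regression line in log-log space is the exponent p.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx
    # The intercept gives log k.
    k = math.exp(mean_y - p * mean_x)
    return p, k

# Synthetic (made-up) modules that follow E = 2 * N^0.6 exactly.
sizes = [4, 8, 16, 32, 64]
edges = [2 * s ** 0.6 for s in sizes]
p, k = fit_rent_exponent(sizes, edges)
```

On real networks the points will scatter around the fitted line, so the quality of the fit would need to be checked before claiming Rentian scaling.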

    Branch-pipe: Improving graph skeletonization around branch points in 3D point clouds

    Modern plant phenotyping requires tools that are robust to noise and missing data, while being able to efficiently process large numbers of plants. Here, we studied the skeletonization of plant architectures from 3D point clouds, which is critical for many downstream tasks, including analyses of plant shape, morphology, and branching angles. Specifically, we developed an algorithm to improve skeletonization at branch points (forks) by leveraging the geometric properties of cylinders around branch points. We tested this algorithm on a diverse set of high-resolution 3D point clouds of tomato and tobacco plants, grown in five environments and across multiple developmental timepoints. Compared to existing methods for 3D skeletonization, our method efficiently and more accurately estimated branching angles even in areas with noisy, missing, or non-uniformly sampled data. Our method is also applicable to inorganic datasets, such as scans of industrial pipes or urban scenes containing networks of complex cylindrical shapes.

    The Computational Power of Beeps

    In this paper, we study the quantity of computational resources (state machine states and/or probabilistic transition precision) needed to solve specific problems in a single-hop network where nodes communicate using only beeps. We begin by focusing on randomized leader election. We prove a lower bound on the states required to solve this problem with a given error bound, probability precision, and (when relevant) network size lower bound. We then show that the bound is tight with a matching upper bound. Noting that our optimal upper bound is slow, we describe two faster algorithms that trade some state optimality to gain efficiency. We then turn our attention to more general classes of problems by proving that once you have enough states to solve leader election with a given error bound, you have (within constant factors) enough states to simulate correctly, with this same error bound, a logspace TM with a constant number of unary input tapes, allowing you to solve a large and expressive set of problems. These results identify a key simplicity threshold beyond which useful distributed computation is possible in the beeping model. Comment: Extended abstract to appear in the Proceedings of the International Symposium on Distributed Computing (DISC 2015)

    A network-based approach for predicting missing pathway interactions

    Embedded within large-scale protein interaction networks are signaling pathways that encode response cascades in the cell. Unfortunately, even for well-studied species like S. cerevisiae, only a fraction of all true protein interactions are known, which makes it difficult to reason about the exact flow of signals and the corresponding causal relations in the network. To help address this problem, we introduce a framework for predicting new interactions that aid connectivity between upstream proteins (sources) and downstream transcription factors (targets) of a particular pathway. Our algorithms attempt to globally minimize the distance between sources and targets by finding a small set of shortcut edges to add to the network. Unlike existing algorithms for predicting general protein interactions, by focusing on proteins involved in specific responses, our approach homes in on pathway-consistent interactions. We applied our method to extend pathways in osmotic stress response in yeast and identified several missing interactions, some of which are supported by published reports. We also performed experiments that support a novel interaction not previously reported. Our framework is general and may be applicable to edge prediction problems in other domains.
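To make the idea of a shortcut edge concrete, here is a minimal sketch of one greedy step: try every missing edge and keep the one that most reduces the summed shortest-path distance from sources to targets. This is not the paper's algorithm, which the abstract does not specify. The undirected toy graph, the unreachable-pair penalty, and the names `bfs_dist`, `total_distance`, and `best_shortcut` are all assumptions made for this example.

```python
from collections import deque
from itertools import combinations

def bfs_dist(adj, start):
    """Hop distances from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def total_distance(adj, sources, targets):
    """Sum of source-to-target distances; unreachable pairs cost len(adj)."""
    total = 0
    for s in sources:
        dist = bfs_dist(adj, s)
        total += sum(dist.get(t, len(adj)) for t in targets)
    return total

def best_shortcut(adj, sources, targets):
    """Return the single missing edge that lowers total distance the most."""
    best_edge, best_cost = None, total_distance(adj, sources, targets)
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # edge already present
        # Temporarily add the candidate edge, score it, then undo.
        adj[u].add(v)
        adj[v].add(u)
        cost = total_distance(adj, sources, targets)
        adj[u].remove(v)
        adj[v].remove(u)
        if cost < best_cost:
            best_edge, best_cost = (u, v), cost
    return best_edge, best_cost

# Toy path graph a-b-c-d-e (made up); "a" is the source, "e" the target.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
edge, cost = best_shortcut(adj, sources=["a"], targets=["e"])
```

A realistic predictor would also need to weigh how biologically plausible each candidate edge is, since the direct source-to-target edge always wins under this purely topological score.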

    Unsupervised segmentation of noisy electron microscopy images using salient watersheds and region merging

    BACKGROUND: Segmenting electron microscopy (EM) images of cellular and subcellular processes in the nervous system is a key step in many bioimaging pipelines involving classification and labeling of ultrastructures. However, fully automated techniques to segment images are often susceptible to noise and heterogeneity in EM images (e.g. different histological preparations, different organisms, different brain regions, etc.). Supervised techniques to address this problem are often helpful but require large sets of training data, which are often difficult to obtain in practice, especially across many conditions. RESULTS: We propose a new, principled unsupervised algorithm to segment EM images using a two-step approach: edge detection via salient watersheds followed by robust region merging. We performed experiments to gather EM neuroimages of two organisms (mouse and fruit fly) using different histological preparations and generated manually curated ground-truth segmentations. We compared our algorithm against several state-of-the-art unsupervised segmentation algorithms and found superior performance using two standard measures of under- and over-segmentation error. CONCLUSIONS: Our algorithm is general and may be applicable to other large-scale segmentation problems for bioimages.

    Decreasing-Rate Pruning Optimizes the Construction of Efficient and Robust Distributed Networks

    Robust, efficient, and low-cost networks are advantageous in both biological and engineered systems. During neural network development in the brain, synapses are massively over-produced and then pruned back over time. This strategy is not commonly used when designing engineered networks, since adding connections that will soon be removed is considered wasteful. Here, we show that for large distributed routing networks, network function is markedly enhanced by hyper-connectivity followed by aggressive pruning and that the global rate of pruning, a developmental parameter not previously studied by experimentalists, plays a critical role in optimizing network structure. We first used high-throughput image analysis techniques to quantify the rate of pruning in the mammalian neocortex across a broad developmental time window and found that the rate is decreasing over time. Based on these results, we analyzed a model of computational routing networks and showed, using both theoretical analysis and simulations, that decreasing rates lead to more robust and efficient networks compared to other rates. We also present an application of this strategy to improve the distributed design of airline networks. Thus, inspiration from neural network formation suggests effective ways to design distributed networks across several domains.

    A distributed algorithm to maintain and repair the trail networks of arboreal ants

    We study how the arboreal turtle ant (Cephalotes goniodontus) solves a fundamental computing problem: maintaining a trail network and finding alternative paths to route around broken links in the network. Turtle ants form a routing backbone of foraging trails linking several nests and temporary food sources. This species travels only in the trees, so its foraging trails are constrained to lie on a natural graph formed by overlapping branches and vines in the tangled canopy. Links between branches, however, can be ephemeral, easily destroyed by wind, rain, or animal movements. Here we report a biologically feasible distributed algorithm, parameterized using field data, that can plausibly describe how turtle ants maintain the routing backbone and find alternative paths to circumvent broken links in the backbone. We validate the ability of this probabilistic algorithm to circumvent simulated breaks in synthetic and real-world networks, and we derive an analytic explanation for why certain features are crucial to improve the algorithm's success. Our proposed algorithm uses fewer computational resources than common distributed graph search algorithms, and thus may be useful in other domains, such as for swarm computing or for coordinating molecular robots.

    A neural data structure for novelty detection

    Novelty detection is a fundamental biological problem that organisms must solve to determine whether a given stimulus departs from those previously experienced. In computer science, this problem is solved efficiently using a data structure called a Bloom filter. We found that the fruit fly olfactory circuit evolved a variant of a Bloom filter to assess the novelty of odors. Compared with a traditional Bloom filter, the fly adjusts novelty responses based on two additional features: the similarity of an odor to previously experienced odors and the time elapsed since the odor was last experienced. We elaborate and validate a framework to predict novelty responses of fruit flies to given pairs of odors. We also translate insights from the fly circuit to develop a class of distance- and time-sensitive Bloom filters that outperform prior filters when evaluated on several biological and computational datasets. Overall, our work illuminates the algorithmic basis of an important neurobiological problem and offers strategies for novelty detection in computational systems.
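For readers unfamiliar with the data structure, the following is a minimal sketch of a traditional Bloom filter used for novelty detection. It is the textbook structure, not the distance- and time-sensitive variant the abstract describes. The class layout, the default sizes, the use of SHA-256 to derive bit positions, and the odor labels are all assumptions chosen for this example.

```python
import hashlib

class BloomFilter:
    """Textbook Bloom filter: a bit array plus several hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        """Record an item as experienced by setting all of its bits."""
        for pos in self._positions(item):
            self.bits[pos] = True

    def is_novel(self, item):
        """True if at least one of the item's bits is still unset."""
        return not all(self.bits[pos] for pos in self._positions(item))

# Hypothetical stimulus label, for illustration only.
odors = BloomFilter()
novel_before = odors.is_novel("odor-A")  # nothing stored yet
odors.add("odor-A")
novel_after = odors.is_novel("odor-A")   # all of its bits are now set
```

A stored item is never reported as novel, but an unseen item can occasionally be reported as familiar when its bits happen to collide with earlier entries. The answer is also binary, with no notion of similarity or elapsed time, which are the two features the abstract says the fly circuit adds.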

    Reconstruction of Network Evolutionary History from Extant Network Topology and Duplication History

    Genome-wide protein-protein interaction (PPI) data are readily available thanks to recent breakthroughs in biotechnology. However, PPI networks of extant organisms are only snapshots of the network evolution. Inferring the whole evolutionary history is a challenging problem in computational biology. In this paper, we present a likelihood-based approach to inferring network evolution history from the topology of PPI networks and the duplication relationship among the paralogs. Simulations show that our approach outperforms the existing ones in terms of the accuracy of reconstruction. Moreover, the growth parameters of several real PPI networks estimated by our method are more consistent with the ones predicted in the literature. Comment: 15 pages, 5 figures, submitted to ISBRA 201